Search results for "Conjugate prior"

showing 4 items of 4 documents

Solving two‐armed Bernoulli bandit problems using a Bayesian learning automaton

2010

Purpose: The two‐armed Bernoulli bandit (TABB) problem is a classical optimization problem where an agent sequentially pulls one of two arms attached to a gambling machine, with each pull resulting either in a reward or a penalty. The reward probabilities of each arm are unknown, and thus one must balance between exploiting existing knowledge about the arms, and obtaining new information. The purpose of this paper is to report research into a completely new family of solution schemes for the TABB problem: the Bayesian learning automaton (BLA) family. Design/methodology/approach: Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. B…

Bayesian statistics; Mathematical optimization; Optimization problem; General Computer Science; Computer science; Bayesian probability; Automata theory; Bayesian inference; Conjugate prior; Automaton; Optimal decision; International Journal of Intelligent Computing and Cybernetics

researchProduct

Generalized Bayesian Pursuit: A Novel Scheme for Multi-Armed Bernoulli Bandit Problems

2011

In the last decades, a myriad of approaches to the multi-armed bandit problem have appeared in several different fields. The current top performing algorithms from the field of Learning Automata reside in the Pursuit family, while UCB-Tuned and the ε-greedy class of algorithms can be seen as state-of-the-art regret minimizing algorithms. Recently, however, the Bayesian Learning Automaton (BLA) outperformed all of these, and other schemes, in a wide range of experiments. Although seemingly incompatible, in this paper we integrate the foundational learning principles motivating the design of the BLA, with the principles of the so-called Generalized Pursuit algorithm (GPST), leading to the Gen…

Learning automata; Computer science; Bayesian probability; Machine learning; Bayesian inference; Conjugate prior; Field (computer science); Probability vector; Principles of learning; Artificial intelligence; Set (psychology)

researchProduct

The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality

2016

The fundamental phenomenon that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have been recently expanded by replacing the ML estimates with their corresponding Bayesian counterparts that incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA), and the discretized Bayesian pursuit algorithm. Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA report…

Mathematical optimization; Learning automata; Discretization; Bayesian probability; 02 engineering and technology; Mathematical proof; 01 natural sciences; Conjugate prior; Field (computer science); 010104 statistics & probability; Artificial Intelligence; Convergence (routing); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; 0101 mathematics; Beta distribution; Mathematics

researchProduct
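The 2016 abstract above contrasts running maximum likelihood (ML) estimates of reward probabilities with Bayesian counterparts built on conjugate priors. As a rough illustration only, and not the pursuit algorithms themselves, the sketch below shows the standard Beta-Bernoulli conjugate update next to the ML estimate. The class name `BetaPosterior`, the function `ml_estimate`, and the uniform Beta(1, 1) prior are assumptions made for this example; none of them come from the listed papers.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta posterior over one arm's Bernoulli reward probability.

    Hypothetical helper for illustration; the Beta(1, 1) default prior
    is an assumption, not taken from the listed papers.
    """
    alpha: float = 1.0  # prior pseudo-count of rewards
    beta: float = 1.0   # prior pseudo-count of penalties

    def update(self, reward: bool) -> None:
        # Conjugacy: a Beta prior with a Bernoulli likelihood yields a
        # Beta posterior, so the update only increments a count.
        if reward:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self) -> float:
        # Posterior mean estimate of the reward probability.
        return self.alpha / (self.alpha + self.beta)


def ml_estimate(rewards: int, pulls: int) -> float:
    """Running ML estimate: observed reward frequency (0.0 before any pull)."""
    return rewards / pulls if pulls else 0.0


# Two rewards and one penalty observed on a single arm.
posterior = BetaPosterior()
for outcome in (True, True, False):
    posterior.update(outcome)

bayes_mean = posterior.mean()   # (1 + 2) / (1 + 2 + 1 + 1)
ml_mean = ml_estimate(2, 3)     # 2 / 3
```

With few observations the two estimates differ noticeably, because the prior pseudo-counts pull the posterior mean toward 0.5; as the number of pulls grows, the two values converge.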

Generalized Bayesian pursuit: A novel scheme for multi-armed Bernoulli bandit problems

2011

Published version of a chapter in the book: IFIP Advances in Information and Communication Technology. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-642-23960-1_16 In the last decades, a myriad of approaches to the multi-armed bandit problem have appeared in several different fields. The current top performing algorithms from the field of Learning Automata reside in the Pursuit family, while UCB-Tuned and the ε-greedy class of algorithms can be seen as state-of-the-art regret minimizing algorithms. Recently, however, the Bayesian Learning Automaton (BLA) outperformed all of these, and other schemes, in a wide range of experiments. Although seemingly incompatible, in…

VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422; VDP::Technology: 500::Information and communication technology: 550; Bandit problems; estimator algorithms; general Bayesian pursuit algorithm; Beta distribution; conjugate priors

researchProduct